GSoC23 — Workweek 10
Introduction
This week I have a special kind of bug hunt for you: modpath delays work when enabled on their own, and so do interconnection delays, but when both are enabled together, the modpath delays are ignored.
And as the second topic this week, I'd like to present the interconnect test suite to track the progress of my work on this feature.
Modpath issue
I was trying to enable both interconnection delays and specify paths at the same time when I noticed that the modpath delays annotated by the specify blocks suddenly did not work anymore.
After some troubleshooting, it turned out that this issue surfaces when a module has input buffers (which are needed for SDF INTERCONNECT support) in its local scope. When the input buffer is moved to the outer scope, all works as expected.
This seems strange, since the netlist should be equivalent in both cases, but still shows different behavior. I was even able to reproduce this issue with the current master.
To document this behavior, I created issue #985.
It took me some time, but I was able to track down the root cause of the issue.
The general concept of modpaths in Icarus Verilog is based on the vvp_fun_modpath_src and vvp_fun_modpath functors. The netlist for a simple buffer with a modpath could look like this:
                                                     output
           |--- vvp_fun_bufz --- vvp_fun_modpath ----------
input      |
-----------|
           |
           |--- vvp_fun_modpath_src
The vvp_fun_bufz provides the actual functionality of the buffer, while vvp_fun_modpath_src and vvp_fun_modpath are there to provide the correct delay from input to output.
What actually should happen (a bit simplified):
- The input value changes.
- This triggers vvp_fun_modpath_src, which stores the current simulation time as wake_time_.
- The input value propagates through the buffer and on to vvp_fun_modpath.
- vvp_fun_modpath looks up which of its vvp_fun_modpath_srcs are active and calculates the wakeup time as the wake_time_ of the vvp_fun_modpath_src plus its delay. If this wakeup time is greater than the current simulation time, the current simulation time is subtracted from it. If there are no delays between the input and vvp_fun_modpath, the result is the delay value of the selected vvp_fun_modpath_src.
- Finally, the output value change is scheduled with the calculated delay.
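The delay calculation in the steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the actual vvp code; the function and parameter names are made up for this example.

```python
def modpath_output_delay(now, src_wake_time, src_delay):
    """Remaining delay before the output may change (illustrative only).

    now           -- simulation time when vvp_fun_modpath is evaluated
    src_wake_time -- time stored by the active vvp_fun_modpath_src (wake_time_)
    src_delay     -- path delay annotated for that source
    """
    wakeup = src_wake_time + src_delay
    # A wakeup time that is not in the future leaves no delay to wait for.
    return max(wakeup - now, 0)


# Input changes at t=100, path delay 5, nothing else between input and modpath.
print(modpath_output_delay(now=100, src_wake_time=100, src_delay=5))  # 5
# Two time units were already spent on the way to the modpath functor.
print(modpath_output_delay(now=102, src_wake_time=100, src_delay=5))  # 3
```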
It is crucial that vvp_fun_modpath_src is evaluated before vvp_fun_modpath. But it seems that this was not the case in this instance.
In the working variant with the input buffer in the outer scope, the functors are connected to input in the following order:
vvp_fun_modpath_src
vvp_fun_bufz
If the buffer is moved into the inner scope, for some reason the order in which these functors are connected to the input is changed:
vvp_fun_bufz
vvp_fun_modpath_src
Since this seems to be the order in which the netlist is evaluated, and vvp_fun_modpath is connected to vvp_fun_bufz, this causes vvp_fun_modpath to be evaluated before its vvp_fun_modpath_src. Thus the wake_time_ of vvp_fun_modpath_src still holds the previous value, and the sum of wake_time_ plus the delay of vvp_fun_modpath_src is less than the current simulation time. Therefore, the delay is set to zero, and it appears that the specify delays are ignored during simulation.
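To make the effect of the connection order tangible, here is a toy Python model of the three functors. It is a heavily simplified sketch: the class names, the recv method and the fan-out list are assumptions for illustration and do not mirror the real vvp classes.

```python
class ModpathSrc:
    def __init__(self, delay):
        self.delay = delay
        self.wake_time = 0  # stands in for wake_time_

    def recv(self, now, value):
        self.wake_time = now  # remember when the input changed


class Modpath:
    def __init__(self, src):
        self.src = src
        self.scheduled = []  # (time, value) of scheduled output changes

    def recv(self, now, value):
        delay = max(self.src.wake_time + self.src.delay - now, 0)
        self.scheduled.append((now + delay, value))


class Bufz:
    def __init__(self, sink):
        self.sink = sink

    def recv(self, now, value):
        self.sink.recv(now, value)  # the buffer propagates immediately


def simulate(src_first):
    src = ModpathSrc(delay=5)
    path = Modpath(src)
    buf = Bufz(path)
    # The fan-out of the input net is evaluated in connection order.
    fanout = [src, buf] if src_first else [buf, src]
    for now, value in [(100, 1), (200, 0)]:
        for functor in fanout:
            functor.recv(now, value)
    return path.scheduled


print(simulate(src_first=True))   # [(105, 1), (205, 0)]
print(simulate(src_first=False))  # [(100, 1), (200, 0)]
```

With the source connected first, the output follows the input after 5 time units. With the buffer first, the stale wake time makes the delay collapse to zero.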
So much for the root cause, but how can we solve this problem? We need to make sure that the vvp_fun_modpath_srcs and vvp_fun_modpath are evaluated in the correct order.
The obvious solution would be to ensure that vvp_fun_modpath_srcs are always connected before any other functors. Another solution would be to make sure that vvp_fun_modpath_srcs are evaluated first even when the connection order is not correct. This could be achieved by prioritization in the scheduler, but it would cost runtime performance. Therefore, I prefer the first option, as does my mentor.
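The idea behind the first option can be illustrated with a small Python sketch. It assumes, purely for illustration, that the fan-out of a net is a plain list and that each entry carries a flag telling whether it is a modpath source; the real data structures in vvp look different, and the connect helper is hypothetical.

```python
def connect(fanout, functor, is_modpath_src):
    """Attach a functor to a net's fan-out, keeping modpath sources in front."""
    if is_modpath_src:
        # Insert behind the modpath sources already present, ahead of the rest.
        pos = sum(1 for _, is_src in fanout if is_src)
        fanout.insert(pos, (functor, True))
    else:
        fanout.append((functor, False))
    return fanout


fanout = []
connect(fanout, "vvp_fun_bufz", False)
connect(fanout, "vvp_fun_modpath_src", True)
print([name for name, _ in fanout])
# ['vvp_fun_modpath_src', 'vvp_fun_bufz']
```

The ordering is fixed once at connection time, so nothing extra has to happen while the simulation is running.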
Tests & Script
Before any new feature can be merged into the codebase, it is necessary to ensure that no regressions have occurred and that the feature works as it should.
The first point is covered by Icarus' regression test suite - a collection of tests covering various aspects of the Verilog language.
So how can we be sure that our interconnection delays are annotated correctly? Well, until now I did this manually by comparing the simulation waveforms with the expected delays as stated in the SDF file. This is tedious work and not feasible for large designs with thousands of gates. So what did I do? I automated my job of checking the waveforms by writing a simple Python script.
To have some interesting designs for the test suite, I ran two simple designs, alu_example and counter_example, through the OpenLane flow with the SKY130 PDK to get the gate-level representation and the SDF file. Next, both GL designs are simulated with -ginterconnect enabled. Finally, my script is executed: it reads the waveforms in .vcd format and uses the (INTERCONNECT ...) statements from the SDF file to verify that the signal transitions occur according to the specified delays. At the end you get a summary with information about how many interconnection delays were simulated correctly.
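The core of such a check can be sketched as follows. This is not the script from the repository, just a minimal illustration: it assumes the transitions have already been extracted from the VCD file into a dictionary, that SDF delays and VCD timestamps use the same time unit, and that the rising delay applies to transitions to 1 and the falling delay to transitions to 0. All function names are made up for this example.

```python
import re

# Matches e.g. (INTERCONNECT u1.Y u2.A (0.10:0.12:0.14) (0.09:0.11:0.13))
INTERCONNECT_RE = re.compile(
    r"\(INTERCONNECT\s+(\S+)\s+(\S+)\s+\(([^)]*)\)\s*\(([^)]*)\)\s*\)"
)


def typ(triple):
    """Return the typical value of a min:typ:max triple (or the single value)."""
    parts = triple.split(":")
    return float(parts[1] if len(parts) == 3 else parts[0])


def parse_interconnects(sdf_text):
    """Yield (source, sink, rise_delay, fall_delay) for each INTERCONNECT."""
    for src, dst, rise, fall in INTERCONNECT_RE.findall(sdf_text):
        yield src, dst, typ(rise), typ(fall)


def check(sdf_text, transitions, tolerance=1e-9):
    """Count interconnects whose sink follows the source by the SDF delay.

    transitions maps a signal name to a list of (time, new_value) pairs.
    Returns (number_of_successes, number_of_failures).
    """
    ok = failed = 0
    for src, dst, rise, fall in parse_interconnects(sdf_text):
        expected = [(t + (rise if v == 1 else fall), v)
                    for t, v in transitions.get(src, [])]
        actual = transitions.get(dst, [])
        match = len(expected) == len(actual) and all(
            ev == av and abs(et - at) <= tolerance
            for (et, ev), (at, av) in zip(expected, actual)
        )
        ok += match
        failed += not match
    return ok, failed


sdf = "(INTERCONNECT u1.Y u2.A (0.10:0.12:0.14) (0.09:0.11:0.13))"
waves = {"u1.Y": [(1.0, 1), (2.0, 0)], "u2.A": [(1.12, 1), (2.11, 0)]}
print(check(sdf, waves))  # (1, 0)
```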
╔═══════════════════════════════════════════╗
║ Simulation Summary ║
╠═══════════════════════════════════════════╣
║ VCD File: alu_example.vcd ║
║ SDF File: alu_example.sdf ║
║ Instance: top.alu_example_inst ║
╠═══════════════════════════════════════════╣
║ Number of Interconnects: 139 ║
║ ✅ Number of Successes: 126 ║
║ ❌ Number of Failures: 13 ║
║ Success Ratio 90.65 % ║
╚═══════════════════════════════════════════╝
As you can see, not all interconnects are annotated correctly yet, but it's a good start!
I bundled everything into a repository so that you can try it out yourself: interconnect-tests
This repository will be part of my deliverables for the final project evaluation.
Summary
Now that the interconnect feature is becoming more and more complete, even more issues arise. But I have new tools to systematically test designs after each change to the source code, which helps me find these problems.
Are you excited to see what's new next week? Me too :)